********************************************************************************
*** Data Paper (MIPD Release 2.0): Consequences of multiple quasi-responses
***
*** Created: 10/23/23
***
********************************************************************************

use "Multiple Responses/Multiple Responses.dta", clear

********************************************************************************
*** Analyze the multiple response dataset (Individual level)
********************************************************************************

* Number of quasi-responses across individuals
sum num_quasi, det
tab num_quasi

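* Among respondents with multiple quasi-responses, how many distinct quasi-response categories were selected?
* (num_quasi > 1 with an explicit missing check, so missing or zero counts are not treated as "multiple")
sum num_diff if num_quasi > 1 & !missing(num_quasi), det
tab num_diff if num_quasi > 1 & !missing(num_quasi)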

********************************************************************************
*** Inferential model of multiple responses
********************************************************************************

use "Multiple Responses/Multiple Responses.dta", clear

drop quasi mipcode

* Generate numeric studyid variable
preserve
	cap drop __* 
	duplicates drop studyid, force
	gen studyid_number = _n
	
	sort studyid
	tempfile sid
	save `sid', replace 
restore

* Many observations per study in memory, one row per study in `sid'
merge m:1 studyid using `sid', keepusing(studyid_number)
drop _merge
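
* Sanity check (a sketch, not part of the original analysis): confirm that every
* observation received a study number and that the number is constant within study.
* sid_check is a hypothetical scratch variable introduced only for this check.
assert !missing(studyid_number)
egen long sid_check = group(studyid)
bysort sid_check: assert studyid_number == studyid_number[1]
drop sid_check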

* Generate some explanatory variables
gen pid_strength = abs(party_id)

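* Generate binary indicators (left missing when the underlying count is missing,
* because Stata treats missing as larger than any number in a > comparison)
gen multiple = num_quasi > 1 if !missing(num_quasi)
gen diff = num_diff > 1 if !missing(num_diff)
gen diff_c = num_diff_c > 1 if !missing(num_diff_c)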

*** Which demographic characteristics make a respondent more likely to select multiple quasi-responses?
* Linear probability models with study fixed effects
reg multiple age male urban white edulevel income_quartile pid_strength i.studyid_number
reg diff age male urban white edulevel income_quartile pid_strength i.studyid_number if multiple == 1
reg diff_c age male urban white edulevel income_quartile pid_strength i.studyid_number if multiple == 1
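
* Equivalent specification (a sketch, assuming only the demographic coefficients
* are of interest): areg absorbs the study fixed effects instead of listing one
* indicator per study, and should reproduce the slope estimates of the first
* model above with a shorter output table.
areg multiple age male urban white edulevel income_quartile pid_strength, absorb(studyid_number)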
